Overview

Dataset Statistics

Number of Variables 11
Number of Rows 1.0838e+06
Missing Cells 9
Missing Cells (%) 0.0%
Duplicate Rows 2
Duplicate Rows (%) 0.0%
Total Size in Memory 454.9 MB
Average Row Size in Memory 440.1 B
Variable Types
  • Categorical: 7
  • Numerical: 4
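
The size and percentage figures above can be cross-checked with simple arithmetic. The sketch below uses only numbers printed in this overview. It assumes that "MB" means 1024**2 bytes, which the report does not state. The variable names are illustrative.

```python
# Cross-check of the overview figures, using only values printed in the report.
n_rows = 1.0838e6        # "Number of Rows" (rounded as printed)
n_vars = 11              # "Number of Variables"
total_mb = 454.9         # "Total Size in Memory"
missing_cells = 9
duplicate_rows = 2

# Assumption: "MB" is binary, i.e. 1024**2 bytes.
avg_row_bytes = total_mb * 1024**2 / n_rows
missing_pct = 100 * missing_cells / (n_rows * n_vars)
duplicate_pct = 100 * duplicate_rows / n_rows

print(f"average row size: {avg_row_bytes:.1f} B")
print(f"missing cells:    {missing_pct:.5f}%")
print(f"duplicate rows:   {duplicate_pct:.5f}%")
```

Under that assumption the average row size comes out at 440.1 B, matching the report. Both percentages are far below 0.05%, which is why they are displayed as 0.0%.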

Dataset Insights

  • Amount Received and Amount Paid have similar distributions
  • From Bank is skewed
  • To Bank is skewed
  • Amount Received is skewed
  • Amount Paid is skewed
  • Timestamp has a high cardinality: 1717 distinct values
  • Account has a high cardinality: 415012 distinct values
  • Account.1 has a high cardinality: 405075 distinct values
  • Timestamp has constant length 16
  • Account has constant length 9
  • Account.1 has constant length 9
  • Is Laundering has constant length 3

Variables


Timestamp

categorical

Approximate Distinct Count 1717
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Memory Size 87787476

Length

Mean 16
Standard Deviation 0
Median 16
Minimum 16
Maximum 16

Sample

1st row 2022/09/01 00:20
2nd row 2022/09/01 00:20
3rd row 2022/09/01 00:00
4th row 2022/09/01 00:02
5th row 2022/09/01 00:06

Letter

Count 0
Lowercase Letter 0
Space Separator 1083796
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 13005552
  • The most frequent word (20220901) occurs over 96.78 times as often as the second most frequent word (0004)
  • Timestamp has words of constant length
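
Timestamp is profiled as a categorical string, so it usually has to be parsed before any time-based analysis. The sketch below parses the sample rows shown above with the standard library. The format string is an assumption inferred from the samples (year/month/day, 24-hour clock); the samples alone cannot rule out a year/day/month layout.

```python
from datetime import datetime

# Sample values copied from the report.
samples = [
    "2022/09/01 00:20",
    "2022/09/01 00:20",
    "2022/09/01 00:00",
    "2022/09/01 00:02",
    "2022/09/01 00:06",
]

# Assumed layout: year/month/day hour:minute.
TIMESTAMP_FORMAT = "%Y/%m/%d %H:%M"

parsed = [datetime.strptime(s, TIMESTAMP_FORMAT) for s in samples]

# Every sample has the constant length of 16 characters reported above.
lengths = {len(s) for s in samples}

for raw, ts in zip(samples, parsed):
    print(raw, "->", ts.isoformat())
```

Minute resolution fits the low distinct count: 1717 distinct values across roughly 1.08 million rows.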

From Bank

numerical

Approximate Distinct Count 19247
Approximate Unique (%) 1.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 17340736
Mean 58573.0445
Minimum 0
Maximum 356300
Zeros 1
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • From Bank is skewed right (γ1 = 1.6543)

Quantile Statistics

Minimum 0
5-th Percentile 12
Q1 1241
Median 15010
Q3 111433
95-th Percentile 248314
Maximum 356300
Range 356300
IQR 110192

Descriptive Statistics

Mean 58573.0445
Standard Deviation 90794.0779
Variance 8.2436e+09
Sum 6.3481e+10
Skewness 1.6543
Kurtosis 1.5402
Coefficient of Variation 1.5501
  • From Bank is not normally distributed (p-value 8.930006450578425e-22)
  • From Bank has 42578 outliers
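
Several of the From Bank figures are derived from others, so they can be checked directly. The sketch below recomputes the IQR, range, variance, and coefficient of variation from the printed values. It also computes 1.5 * IQR fences. That is a common outlier rule, but treat it as an assumption: the report does not say how its outlier count was obtained.

```python
# Values printed for From Bank.
q1, q3 = 1241, 111433
minimum, maximum = 0, 356300
mean, std = 58573.0445, 90794.0779

iqr = q3 - q1                 # interquartile range
value_range = maximum - minimum
variance = std ** 2
cv = std / mean               # coefficient of variation

# Assumption: outliers lie more than 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print("IQR:", iqr)
print("Range:", value_range)
print(f"Variance: {variance:.4e}")
print("CV:", round(cv, 4))
print("Fences:", lower_fence, upper_fence)
```

The lower fence is negative, so under this rule only high values can be outliers. The upper fence (276721) sits above the 95th percentile (248314), so fewer than 5% of rows can exceed it. That is consistent with 42578 outliers in about 1.08 million rows.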

Account

categorical

Approximate Distinct Count 415012
Approximate Unique (%) 38.3%
Missing 1
Missing (%) 0.0%
Memory Size 80200830
  • The most frequent value (100428660) occurs over 1.66 times as often as the second most frequent value (1004286A8)

Length

Mean 9
Standard Deviation 0
Median 9
Minimum 9
Maximum 9

Sample

1st row 8000EBD30
2nd row 8000F4580
3rd row 8000F4670
4th row 8000F5030
5th row 8000F5200

Letter

Count 2179670
Lowercase Letter 0
Space Separator 0
Uppercase Letter 2179670
Dash Punctuation 0
Decimal Number 7574485
  • The most frequent word (100428660) occurs over 1.66 times as often as the second most frequent word (1004286a8)
  • Account has words of constant length
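
The rows under "Letter" use the names of Unicode general categories. The sketch below counts characters by category for the Account samples above, on the assumption that the report classifies characters the same way. It then uses the printed totals to recover the number of non-missing values.

```python
import unicodedata
from collections import Counter

# Sample values copied from the report.
samples = ["8000EBD30", "8000F4580", "8000F4670", "8000F5030", "8000F5200"]

# Lu = Uppercase Letter, Nd = Decimal Number.
counts = Counter(unicodedata.category(ch) for s in samples for ch in s)
print("Uppercase Letter:", counts["Lu"])
print("Decimal Number:", counts["Nd"])

# Totals printed in the report for the whole column.
uppercase_total = 2179670
decimal_total = 7574485
value_length = 9  # constant length reported for Account

# Every character is a letter or a digit, so the totals imply the value count.
implied_values = (uppercase_total + decimal_total) // value_length
print("Implied non-missing values:", implied_values)
```

The implied count of 1083795 is one less than the 1083796 space separators counted for Timestamp (one per row), which agrees with the single missing Account value.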

To Bank

numerical

Approximate Distinct Count 15159
Approximate Unique (%) 1.4%
Missing 1
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 17340720
Mean 70747.6483
Minimum 1
Maximum 356294
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • To Bank is skewed right (γ1 = 1.2461)

Quantile Statistics

Minimum 1
5-th Percentile 15
Q1 5425
Median 22345
Q3 126426
95-th Percentile 243947
Maximum 356294
Range 356293
IQR 121001

Descriptive Statistics

Mean 70747.6483
Standard Deviation 90822.9339
Variance 8.2488e+09
Sum 7.6676e+10
Skewness 1.2461
Kurtosis 0.2942
Coefficient of Variation 1.2838
  • To Bank is not normally distributed (p-value 2.151757069556257e-17)
  • To Bank has 28411 outliers

Account.1

categorical

Approximate Distinct Count 405075
Approximate Unique (%) 37.4%
Missing 1
Missing (%) 0.0%
Memory Size 80200830

Length

Mean 9
Standard Deviation 0
Median 9
Minimum 9
Maximum 9

Sample

1st row 8000EBD30
2nd row 8000F5340
3rd row 8000F4670
4th row 8000F5030
5th row 8000F5200

Letter

Count 2335593
Lowercase Letter 0
Space Separator 0
Uppercase Letter 2335593
Dash Punctuation 0
Decimal Number 7418562
  • Account.1 has words of constant length

Amount Received

numerical

Approximate Distinct Count 668814
Approximate Unique (%) 61.7%
Missing 1
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 17340720
Mean 7.0183e+06
Minimum 1e-06
Maximum 2.8438e+11
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Amount Received is skewed right (γ1 = 300.7093)

Quantile Statistics

Minimum 1e-06
5-th Percentile 5.05
Q1 104.48
Median 1945.46
Q3 25518.812
95-th Percentile 1.7494e+06
Maximum 2.8438e+11
Range 2.8438e+11
IQR 25414.332

Descriptive Statistics

Mean 7.0183e+06
Standard Deviation 5.9381e+08
Variance 3.5261e+17
Sum 7.6064e+12
Skewness 300.7093
Kurtosis 118481.6344
Coefficient of Variation 84.6089
  • Amount Received is not normally distributed (p-value 4.226515706443173e-25)
  • Amount Received has 201544 outliers

Receiving Currency

categorical

Approximate Distinct Count 15
Approximate Unique (%) 0.0%
Missing 1
Missing (%) 0.0%
Memory Size 78651029
  • The most frequent value (US Dollar) occurs over 1.61 times as often as the second most frequent value (Euro)

Length

Mean 7.57
Standard Deviation 3.2684
Median 9
Minimum 3
Maximum 17

Sample

1st row US Dollar
2nd row US Dollar
3rd row US Dollar
4th row US Dollar
5th row US Dollar

Letter

Count 7596200
Lowercase Letter 5462854
Space Separator 608154
Uppercase Letter 2133346
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (US Dollar, Euro) take over 50.0%

Amount Paid

numerical

Approximate Distinct Count 673936
Approximate Unique (%) 62.2%
Missing 1
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 17340720
Mean 5.8112e+06
Minimum 1e-06
Maximum 2.8438e+11
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Amount Paid is skewed right (γ1 = 340.3944)

Quantile Statistics

Minimum 1e-06
5-th Percentile 5.1
Q1 105.4156
Median 1945.4496
Q3 25552.8016
95-th Percentile 1.6967e+06
Maximum 2.8438e+11
Range 2.8438e+11
IQR 25447.386

Descriptive Statistics

Mean 5.8112e+06
Standard Deviation 4.6209e+08
Variance 2.1352e+17
Sum 6.2982e+12
Skewness 340.3944
Kurtosis 165114.0913
Coefficient of Variation 79.5164
  • Amount Paid is not normally distributed (p-value 4.226515011974823e-25)
  • Amount Paid has 200973 outliers
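
The overview flags Amount Received and Amount Paid as having similar distributions. The sketch below compares the printed quantiles of the two columns as a rough illustration of that claim. It is not the statistical test the report ran, which is not stated.

```python
# Quantiles printed for each column.
received = {"p5": 5.05, "q1": 104.48, "median": 1945.46,
            "q3": 25518.812, "p95": 1.7494e6}
paid = {"p5": 5.1, "q1": 105.4156, "median": 1945.4496,
        "q3": 25552.8016, "p95": 1.6967e6}

# Relative difference at each quantile, scaled by the larger of the two values.
rel_diff = {
    name: abs(received[name] - paid[name]) / max(received[name], paid[name])
    for name in received
}

for name, diff in rel_diff.items():
    print(f"{name}: {100 * diff:.2f}%")

worst = max(rel_diff, key=rel_diff.get)
print("largest gap at:", worst)
```

The quantiles agree to within a few percent, with the largest gap at the 95th percentile. The means differ by about 17% (7.0183e+06 versus 5.8112e+06), which reflects how strongly the extreme right tail drives the mean in both columns.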

Payment Currency

categorical

Approximate Distinct Count 15
Approximate Unique (%) 0.0%
Missing 1
Missing (%) 0.0%
Memory Size 78649121
  • The most frequent value (US Dollar) occurs over 1.62 times as often as the second most frequent value (Euro)

Length

Mean 7.5683
Standard Deviation 3.2633
Median 9
Minimum 3
Maximum 17

Sample

1st row US Dollar
2nd row US Dollar
3rd row US Dollar
4th row US Dollar
5th row US Dollar

Letter

Count 7593424
Lowercase Letter 5457178
Space Separator 609022
Uppercase Letter 2136246
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (US Dollar, Euro) take over 50.0%

Payment Format

categorical

Approximate Distinct Count 7
Approximate Unique (%) 0.0%
Missing 1
Missing (%) 0.0%
Memory Size 80182156
  • The most frequent value (Reinvestment) occurs over 1.9 times as often as the second most frequent value (Cheque)

Length

Mean 8.9828
Standard Deviation 3.3952
Median 12
Minimum 3
Maximum 12

Sample

1st row Reinvestment
2nd row Cheque
3rd row Reinvestment
4th row Reinvestment
5th row Reinvestment

Letter

Count 9574224
Lowercase Letter 8160788
Space Separator 161257
Uppercase Letter 1413436
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Reinvestment, Cheque) take over 50.0%
  • The most frequent word (reinvestment) occurs over 1.9 times as often as the second most frequent word (cheque)

Is Laundering

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 1
Missing (%) 0.0%
Memory Size 73698060
  • The most frequent value (0.0) occurs over 1705.76 times as often as the second most frequent value (1.0)

Length

Mean 3
Standard Deviation 0
Median 3
Minimum 3
Maximum 3

Sample

1st row 0.0
2nd row 0.0
3rd row 0.0
4th row 0.0
5th row 0.0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 2167590
  • The top 2 categories (0.0, 1.0) take over 50.0%
  • The most frequent word (00) occurs over 1705.76 times as often as the second most frequent word (10)
  • Is Laundering has words of constant length
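
The frequency ratio above points to a severe class imbalance. The sketch below estimates the class counts from two printed figures: the digit count and the ratio. The counts are estimates derived from rounded report values, not read from the data, and the ratio is treated as the count of "0.0" divided by the count of "1.0".

```python
# Figures printed for Is Laundering.
decimal_digits = 2167590   # "Decimal Number" count
ratio = 1705.76            # count of "0.0" relative to count of "1.0" (rounded)

# Each value ("0.0" or "1.0") contains exactly two digits.
n_values = decimal_digits // 2

# With negatives = ratio * positives, positives = n / (ratio + 1).
positives = round(n_values / (ratio + 1))
negatives = n_values - positives
positive_rate = positives / n_values

print("non-missing values:", n_values)
print("estimated positives:", positives)
print("estimated negatives:", negatives)
print(f"positive rate: {100 * positive_rate:.4f}%")
```

On these figures, roughly 635 of 1083795 labelled rows are marked as laundering, or about 0.06%. Accuracy is therefore a poor metric for any model trained on this column.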

Interactions

Correlations

Missing Values